Text Mining from Free Unstructured Text: An Experiment of Time Series Retrieval for Volcano Monitoring
Authors
Abstract
Volcanic activity may influence climate parameters and impact people's safety, hence monitoring its characteristic indicators and their temporal evolution is crucial. Several databases, communications, and literature providing data, information, and updates on active volcanoes worldwide are available, and will likely increase in the future. Consequently, extraction and text mining techniques that aim to efficiently analyze such databases and gather data of interest on a specific volcano can play an important role in this applied science field. This work presents a natural language processing (NLP) system that we developed to extract geochemical and geophysical data from free unstructured text included in reports and operational bulletins issued by volcanological observatories in HTML, PDF, and MS Word formats. The NLP system enables the extraction of relevant gas parameters (e.g., SO2 and CO2 flux) from the text, and was tested on a series of 2839 daily and weekly bulletins published online between 2015 and 2021 for Stromboli (Italy). The experiment shows that the system proves capable of extracting the time series of a set of user-defined parameters, which can later be analyzed and interpreted by specialists in relation with other geospatial data. The system can potentially be tuned to target other parameters and databases.
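The kind of extraction described in the abstract can be illustrated with a minimal sketch. It is not the authors' NLP system: it swaps in plain regular expressions, and the bulletin wording, dates, values, the `t/d` unit, and the names `BULLETINS`, `PATTERNS`, and `extract_time_series` are all assumptions made for illustration.

```python
import re
from datetime import date

# Hypothetical bulletin snippets: wording, dates, and values are invented
# for illustration and do not come from real observatory bulletins.
BULLETINS = [
    (date(2020, 1, 13), "SO2 flux remained low, 120 t/d; no CO2 flux data available."),
    (date(2020, 1, 6), "The SO2 flux was at a medium level (about 250 t/d). CO2 flux: 480 t/d."),
]

# One pattern per user-defined parameter: the gas name and the word "flux",
# then up to 60 characters within the same sentence, then a number followed
# by the (assumed) unit "t/d".
PATTERNS = {
    name: re.compile(
        rf"{name}\s+flux[^.]{{0,60}}?(\d+(?:\.\d+)?)\s*t/d", re.IGNORECASE
    )
    for name in ("SO2", "CO2")
}


def extract_time_series(bulletins):
    """Return {parameter: [(date, value), ...]} ordered by bulletin date."""
    series = {name: [] for name in PATTERNS}
    for day, text in sorted(bulletins, key=lambda item: item[0]):
        for name, pattern in PATTERNS.items():
            match = pattern.search(text)
            if match:  # bulletins without a value are simply skipped
                series[name].append((day, float(match.group(1))))
    return series


series = extract_time_series(BULLETINS)
for name, points in series.items():
    print(name, points)
```

A real pipeline would first convert HTML, PDF, and MS Word bulletins to plain text and would need far more robust handling of phrasing and units than a single pattern per parameter.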
Similar resources
Mining criminal networks from unstructured text documents
Digital data collected for forensics analysis often contain valuable information about the suspects’ social networks. However, most collected records are in the form of unstructured textual data, such as e-mails, chat messages, and text documents. An investigator often has to manually extract the useful information from the text and then enter the important pieces into a structured database for...
Entity Retrieval and Text Mining for Online Reputation Monitoring
Online Reputation Monitoring (ORM) is concerned with the use of computational tools to measure the reputation of entities online, such as politicians or companies. In practice, current ORM methods are constrained to the generation of data analytics reports, which aggregate statistics of popularity and sentiment on social media. We argue that this format is too restrictive as end users often lik...
Text Mining for Technology Monitoring
A considerable part of scientific and technological knowledge is coded in writing. In this context, automated text categorization can be regarded as a promising tool, particularly for patent data analysis. In a real-life example, we show that automated text categorization can closely resemble the time-consuming categorisation job of an expert. By comparing different algorithms we reveal systema...
Mining Free Text for Structure
INTRODUCTION When the manager of a mutual fund sits down to write an update of the fund's prospectus, he does not start his job from scratch. He knows what the fund's shareholders expect to see in the document and arranges the information accordingly. An inventor, ready to register his idea with the Patent and Trademark Office of the U.S. Department of Commerce, writes it up in accordance with ...
Design and Test of the Real-time Text mining dashboard for Twitter
One of today's major research trends in the field of information systems is the discovery of implicit knowledge hidden in datasets that are currently being produced at high speed, in large volumes, and in a wide variety of formats. Data with such features is called big data. Extracting, processing, and visualizing this huge amount of data has today become one of the concerns of data science scholar...
Journal
Journal title: Applied Sciences
Year: 2022
ISSN: 2076-3417
DOI: https://doi.org/10.3390/app12073503